Fix explored nodes counter #570

nguidotti · 2025-11-06T14:45:40Z

This PR changes how the explored-node counter is updated, so that it is incremented only after a node has been solved in the branch-and-bound (B&B) search.

Checklist

I am familiar with the Contributing Guidelines.
Testing
- New or existing tests cover these changes
- Added tests
- Created an issue to follow-up
- NA
Documentation
- The documentation is up to date with these changes
- Added new documentation
- NA

Summary by CodeRabbit

Refactor
- Improved progress reporting cadence and coordination in the branch-and-bound solver.
- More accurate and consistent node-counting in progress displays.
- Reduced noisy/duplicate log output and ensured coordinated reporting from solver start.


coderabbitai · 2025-11-06T14:45:59Z

📝 Walkthrough

Walkthrough

Node-counting and progress-logging were refactored: per-step counter updates were moved to occur around cutoff handling, and a new atomic flag should_report_ was added to gate and coordinate when logging occurs. Logs now compute/display metrics using local counters, then reset logging state.

Changes

Cohort / File(s)	Summary
Header member addition `cpp/src/dual_simplex/branch_and_bound.hpp`	Added `std::atomic<bool> should_report_` member (placed after solver status) for coordinated/report-gating behavior. No public API/signature changes.
Node counting & logging refactor `cpp/src/dual_simplex/branch_and_bound.cpp`	Moved updates of `nodes_explored`, `nodes_unexplored`, and `nodes_since_last_log` to occur around cutoff handling instead of per-step; introduced `should_report_` gating to control when verbose logging occurs; logging now computes gap/metrics from local counters, resets log state (nodes_since_last_log, last_log) and re-enables reporting; adjusted multiple paths (exploration_ramp_up, explore_subtree, solve initialization and ramp-by-thread) to use the new cadence and counter placement.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pay attention to thread-safety and atomic semantics around should_report_.
Verify that counter increments/decrements remain consistent across all cutoffs and early-return paths.
Confirm logging state reset (nodes_since_last_log, last_log) and timing matches intended reporting cadence.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Fix explored nodes counter' directly matches the PR's core objective of correcting how the explored node counter is updated in the branch-and-bound process.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9734677 and d499bea.

📒 Files selected for processing (2)

cpp/src/dual_simplex/branch_and_bound.cpp (8 hunks)
cpp/src/dual_simplex/branch_and_bound.hpp (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{cu,cuh,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
Eliminate unnecessary host-device synchronization (cudaDeviceSynchronize) in hot paths that blocks GPU pipeline; use streams and events for async execution
Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse
Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Check that hard-coded GPU de...

Files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

**/*.{h,hpp,py}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Verify C API does not break ABI stability (no struct layout changes, field reordering); maintain backward compatibility in Python and server APIs with deprecation warnings

Files:

cpp/src/dual_simplex/branch_and_bound.hpp

**/*.{cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

**/*.{cpp,hpp,h}: Check for unclosed file handles when reading MPS/QPS problem files; ensure RAII patterns or proper cleanup in exception paths
Validate input sanitization to prevent buffer overflows and resource exhaustion attacks; avoid unsafe deserialization of problem files
Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

**/*.{cu,cpp,hpp,h}

📄 CodeRabbit inference engine (.github/.coderabbit_review_guide.md)

Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code

Files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

🧠 Learnings (8)

📚 Learning: 2025-11-25T10:20:49.810Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.810Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results

Applied to files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.810Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.810Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)

Applied to files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.811Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.811Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms

Applied to files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.811Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.811Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover

Applied to files:

cpp/src/dual_simplex/branch_and_bound.hpp
cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.810Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.810Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations

Applied to files:

cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.811Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.811Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Assess algorithmic complexity for large-scale problems (millions of variables/constraints); ensure O(n log n) or better complexity, not O(n²) or worse

Applied to files:

cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.811Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.811Z
Learning: Applies to **/*.{cpp,hpp,h} : Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state

Applied to files:

cpp/src/dual_simplex/branch_and_bound.cpp

📚 Learning: 2025-11-25T10:20:49.811Z

Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.811Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state

Applied to files:

cpp/src/dual_simplex/branch_and_bound.cpp

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: wheel-build-cuopt-sh-client / 13.0.2, 3.10, amd64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, arm64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, amd64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, arm64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, arm64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.10, arm64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.11, amd64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.13, amd64, rockylinux8
GitHub Check: wheel-build-cuopt-mps-parser / 13.0.2, 3.12, amd64, rockylinux8
GitHub Check: checks / check-style

🔇 Additional comments (8)

cpp/src/dual_simplex/branch_and_bound.hpp (1)

162-163: LGTM: Coordinated logging gate added.

The atomic boolean should_report_ provides a centralized gate to coordinate logging cadence across parallel exploration paths, aligning with the updated counter tracking logic in the implementation file.

cpp/src/dual_simplex/branch_and_bound.cpp (7)

783-792: Correct cutoff handling: node not counted as explored.

The cutoff path now only decrements nodes_unexplored without incrementing nodes_explored, which is correct since the node is fathomed before solving. This aligns with the PR objective to increment the explored counter only after solving a node.

803-826: Logging gated by should_report_ with local counter snapshots.

The logging block uses should_report_.exchange(false) to serialize writes and displays local snapshots of counters. This ensures only one thread logs at a time while allowing other threads to continue updating global counters.
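The gating pattern described in this comment can be sketched as follows. This is a minimal illustration and not the solver's actual code: `report_gate_t`, `begin_report`, and `end_report` are hypothetical names, and only the use of `exchange(false)` on a `std::atomic<bool>` is taken from the review comment.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the solver's reporting state (names are illustrative).
struct report_gate_t {
  std::atomic<bool> should_report{true};  // true => the next caller may log

  // Returns true only for the caller that wins the right to log.
  bool begin_report()
  {
    // exchange(false) atomically returns the previous value and clears the flag,
    // so at most one thread can observe `true` before the flag is re-armed.
    return should_report.exchange(false);
  }

  // Called by the winning thread once its progress line has been written.
  void end_report()
  {
    assert(!should_report.load());  // the gate must still be closed here
    should_report.store(true);      // re-enable reporting for the next round
  }
};
```

A thread would call `begin_report()`, print from local snapshots of the counters if it returns `true`, and then call `end_report()`. Threads that lose the exchange simply skip logging and keep exploring.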

857-859: Core fix: counters updated after solving the node.

The explored and unexplored counters are now updated after solve_node returns (line 844-855), ensuring that nodes_explored is incremented only after the LP relaxation is actually solved. This is the primary fix for the PR.
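The counter placement discussed here can be sketched as below. It is a simplified illustration under stated assumptions: `stats_t`, `outcome_t`, and `process_node` are hypothetical, the cutoff test is reduced to a single bound comparison (the real condition also checks the relative gap), and the LP solve is replaced by a comment.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical statistics block; the field names mirror those mentioned in the PR.
struct stats_t {
  std::atomic<int> nodes_explored{0};
  std::atomic<int> nodes_unexplored{0};
  std::atomic<int> nodes_since_last_log{0};
};

enum class outcome_t { cutoff, solved };

// Processes one node taken from the pool of unexplored nodes.
inline outcome_t process_node(stats_t& stats, double node_lower_bound, double upper_bound)
{
  assert(stats.nodes_unexplored.load() > 0);

  if (node_lower_bound > upper_bound) {
    // Fathomed without solving an LP: the node leaves the unexplored pool,
    // but it is not counted as explored and does not advance the log cadence.
    --stats.nodes_unexplored;
    return outcome_t::cutoff;
  }

  // ... the LP relaxation of the node would be solved here ...

  // Only now, after the solve, is the node counted as explored.
  ++stats.nodes_explored;
  --stats.nodes_unexplored;
  ++stats.nodes_since_last_log;
  return outcome_t::solved;
}
```

With this layout, `nodes_explored` counts LP solves only, which matches the definition the reviewers settle on later in the thread.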

920-924: Consistent cutoff handling in explore_subtree.

Consistent with exploration_ramp_up, the cutoff path only decrements nodes_unexplored without incrementing nodes_explored, correctly implementing the PR objective.

929-955: Single-threaded logging in explore_subtree via task_id check.

The logging block is protected by if (task_id == 0) ensuring only the master best-first thread logs in this phase, with local counter snapshots used for display. The barrier at line 1385 in solve() ensures no overlap with ramp-up phase logging.

982-984: Counters updated after solving in explore_subtree.

Consistent with exploration_ramp_up, the counters are updated after solve_node returns, correctly implementing the fix across both exploration phases.

1368-1368: Initialize should_report_ for startup logging.

Setting should_report_ to true enables immediate coordinated reporting when parallel exploration begins, allowing the first logging decision to proceed.


nguidotti · 2025-11-06T14:46:36Z

/ok to test bd697c9

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cpp/src/dual_simplex/branch_and_bound.cpp (1)
723-737: Use the current lower bound when logging

Here we print progress using user_lower = compute_user_objective(original_lp_, root_objective_). That root objective is the initial LP bound, so after the tree tightens the lower bound the reported gap never reflects it and the metric stays artificially wide. Please base user_lower (and the gap) on the current lower bound you already computed for this node.
diff --git a/cpp/src/dual_simplex/branch_and_bound.cpp b/cpp/src/dual_simplex/branch_and_bound.cpp
@@
-      f_t obj              = compute_user_objective(original_lp_, upper_bound);
-      f_t user_lower       = compute_user_objective(original_lp_, root_objective_);
+      f_t obj              = compute_user_objective(original_lp_, upper_bound);
+      f_t user_lower       = compute_user_objective(original_lp_, lower_bound);
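To see why the choice of lower bound matters for the reported gap, consider the small sketch below. The `relative_gap` function and its formula are assumptions made for illustration; the thread does not show how the solver computes the gap, and whether this suggestion was adopted is not stated.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical relative-gap definition, used only for illustration.
inline double relative_gap(double upper_bound, double lower_bound)
{
  assert(lower_bound <= upper_bound);  // minimization: the bound never exceeds the incumbent
  const double denom = std::max(std::abs(upper_bound), 1e-10);  // guard against division by zero
  return (upper_bound - lower_bound) / denom;
}
```

For an incumbent of 10, a root bound of 5, and a current tree bound of 8, `relative_gap(10.0, 5.0)` gives 0.5 while `relative_gap(10.0, 8.0)` gives 0.2. A log line based on the root bound would keep showing 50% even though the search has already narrowed the gap to 20%.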

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc49f7a and 9734677.

📒 Files selected for processing (2)

cpp/src/dual_simplex/branch_and_bound.cpp (8 hunks)
cpp/src/dual_simplex/branch_and_bound.hpp (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

cpp/src/dual_simplex/branch_and_bound.cpp (2)

cpp/src/dual_simplex/branch_and_bound.hpp (6)

node (89-94)

node (89-89)

node (96-103)

node (96-98)

search_tree (233-237)

search_tree (258-264)

cpp/src/dual_simplex/solve.cpp (6)

compute_user_objective (98-103)

compute_user_objective (98-98)

compute_user_objective (106-110)

compute_user_objective (106-106)

compute_user_objective (650-651)

compute_user_objective (653-653)

cpp/src/dual_simplex/branch_and_bound.cpp

chris-maes · 2025-11-08T00:28:34Z

cpp/src/dual_simplex/branch_and_bound.cpp

  if (lower_bound > upper_bound || rel_gap < settings_.relative_mip_gap_tol) {
    search_tree->graphviz_node(settings_.log, node, "cutoff", node->lower_bound);
    search_tree->update_tree(node, node_status_t::FATHOMED);
+    ++stats_.nodes_explored;


This is a bit strange. I think of nodes explored as the number of nodes we have solved, or perhaps proved infeasible through node presolve. If we are fathoming the node here I would not count it as explored.

chris-maes · 2025-11-08T00:29:29Z

cpp/src/dual_simplex/branch_and_bound.cpp

    search_tree->update_tree(node, node_status_t::FATHOMED);
+    ++stats_.nodes_explored;
+    --stats_.nodes_unexplored;
+    ++stats_.nodes_since_last_log;


Same here. The purpose of nodes since last log is for us to print every so often in a deterministic way. Here we did no work, so I don't think we want to increase this counter.
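A count-based cadence of the kind described in this comment might look like the sketch below. `log_cadence_t`, its methods, and the `log_every` threshold are hypothetical; the actual solver also tracks a `last_log` timestamp, which is omitted here.

```cpp
#include <cassert>

// Hypothetical count-based log cadence; the threshold is an illustrative parameter.
struct log_cadence_t {
  int log_every;
  int nodes_since_last_log = 0;
  int reports              = 0;

  explicit log_cadence_t(int every) : log_every(every) { assert(every > 0); }

  // A solved node represents real work, so it advances the cadence.
  void on_node_solved()
  {
    ++nodes_since_last_log;
    maybe_report();
  }

  // A trivially fathomed node did no work, so the counter is left untouched.
  void on_node_fathomed() { maybe_report(); }

 private:
  void maybe_report()
  {
    if (nodes_since_last_log >= log_every) {
      ++reports;                 // stands in for printing a progress line
      nodes_since_last_log = 0;  // reset so the next report is `log_every` solves away
    }
  }
};
```

Because only solved nodes advance the counter, the number of progress lines depends on the amount of LP work done and not on how many nodes happened to be cut off along the way.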

chris-maes · 2025-11-08T00:30:51Z

cpp/src/dual_simplex/branch_and_bound.cpp

-    if (stats_.nodes_explored.load() == nodes_explored) {
-      stats_.nodes_since_last_log = 0;
-      stats_.last_log             = tic();
+    bool should_report = should_report_.exchange(false);


Do we need to have member variable to decide if we should log? I think this is increasing the complexity of the code for little return.

nguidotti

Since this code is run by multiple threads, the atomic here prevents multiple reports from being printed (potentially out of order). But I agree that it is an imperfect solution. I will probably change it when the ramp-up phase is reworked or the report is moved to a separate thread.

chris-maes · 2025-11-08T00:31:53Z

cpp/src/dual_simplex/branch_and_bound.cpp

    }
  }
+
+  ++stats_.nodes_explored;


Nit: is there a reason why all of these are ++x instead of x++?

Why are we incrementing the nodes explored here at all?
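On the `++x` versus `x++` nit: assuming the counters are `std::atomic` integers (the diff calls `.load()` on `nodes_explored`, which suggests so), both forms perform the same atomic read-modify-write and differ only in the value they return. The helper names below are hypothetical and exist only to make that difference visible.

```cpp
#include <atomic>
#include <cassert>

// Pre-increment on std::atomic returns the value after the increment.
inline int pre_increment_result(std::atomic<int>& counter) { return ++counter; }

// Post-increment on std::atomic returns the value before the increment.
inline int post_increment_result(std::atomic<int>& counter) { return counter++; }
```

When the result is discarded, as in the statements under review, the two spellings behave the same, so the choice is a matter of style.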

chris-maes

Let's not count a node as explored if we trivially fathom it.

Let's talk offline about why you increment the nodes explored at the end of explore_subtree.

github-actions · 2025-11-15T09:06:21Z

🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

github-actions · 2025-11-25T09:09:04Z

🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

chris-maes · 2025-11-25T16:31:21Z

@nguidotti do you want this to go into the 25.12 release?


nguidotti · 2025-11-25T16:49:20Z

@nguidotti do you want this to go into the 25.12 release?

Since it is quite a straightforward change, I think we can target the 25.12 release. I have already updated the code with your suggestions and am now checking that it works on square41 and other MIPLIB instances.

nguidotti · 2025-11-25T17:07:53Z

Can you review again @chris-maes?

chris-maes

LGTM. Thanks for the fix.

nguidotti · 2025-11-26T09:31:27Z

/merge

nguidotti added 2 commits November 6, 2025 15:36

fixed the node exploration count, so it is only update after solving …

6b51863

…the node

add missing initialization

d110e4a

nguidotti requested a review from a team as a code owner November 6, 2025 14:45

nguidotti requested review from akifcorduk, chris-maes and rg20 and removed request for rg20 November 6, 2025 14:45

nguidotti added the bug, non-breaking, and mip labels Nov 6, 2025

nguidotti added this to the 25.12 milestone Nov 6, 2025

Merge branch 'main' into fix-explored-counter

bd697c9

added missing counter

9734677

coderabbitai bot reviewed Nov 6, 2025


chris-maes reviewed Nov 8, 2025


chris-maes requested changes Nov 8, 2025


rgsl888prabhu changed the base branch from main to release/25.12 November 17, 2025 21:35

nguidotti mentioned this pull request Nov 20, 2025

Propagate the bounds from the parent to the child nodes #473

Merged

8 tasks

nguidotti added 2 commits November 25, 2025 17:39

Merge remote-tracking branch 'cuopt/release/25.12' into fix-explored-…

c25d295

…counter

merge 2

66fe769

updated counter based on reviewers feedback

d499bea

chris-maes approved these changes Nov 25, 2025


rapids-bot bot merged commit 51a6b35 into NVIDIA:release/25.12 Nov 26, 2025
355 of 363 checks passed

nguidotti deleted the fix-explored-counter branch November 26, 2025 09:31
